Abstract

Parikh's theorem states that the Parikh image of a context-free language is 
semilinear or, equivalently, that every context-free language has the same Parikh 
image as some regular language. We present a very simple construction that, 
given a context-free grammar, produces a finite automaton recognizing such a 
regular language. 



The Parikh image of a word $w$ over an alphabet $\{a_1, \ldots, a_n\}$ is the vector
$(v_1, \ldots, v_n) \in \mathbb{N}^n$ such that $v_i$ is the number of occurrences of $a_i$ in $w$. For
example, the Parikh image of $a_1 a_1 a_2 a_2$ over the alphabet $\{a_1, a_2, a_3\}$ is $(2, 2, 0)$.
The Parikh image of a language is the set of Parikh images of its words. Parikh
images are named after Rohit Parikh, who in 1966 proved a classical theorem
of formal language theory which also carries his name. Parikh's theorem [1]
states that the Parikh image of any context-free language is semilinear. Since
semilinear sets coincide with the Parikh images of regular languages, the theorem
is equivalent to the statement that every context-free language has the same
Parikh image as some regular language. For instance, the language
$\{a^n b^n \mid n \geq 0\}$ has the same Parikh image as $(ab)^*$. This statement is also often referred
to as Parikh's theorem, see e.g. [10], and in fact it has been considered a more
natural formulation [14].
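As a small illustration of the definition, the following Python sketch computes Parikh images and checks the two examples above. The function name `parikh_image` is hypothetical, and the comparison of $\{a^n b^n\}$ with $(ab)^*$ is only carried out for a few small values of $n$.

```python
from collections import Counter

def parikh_image(word, alphabet):
    """Return the Parikh image of `word` as a tuple ordered like `alphabet`."""
    counts = Counter(word)
    return tuple(counts[symbol] for symbol in alphabet)

# Words are sequences of symbols, so multi-character symbol names also work.
example = parikh_image(["a1", "a1", "a2", "a2"], ["a1", "a2", "a3"])

# a^n b^n and (ab)^n have the same Parikh image; checked here for n < 6 only.
same = all(
    parikh_image("a" * n + "b" * n, "ab") == parikh_image("ab" * n, "ab")
    for n in range(6)
)
```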

Parikh's proof of the theorem, as many other subsequent proofs [8, 14, 13, 
9, 10, 2], is constructive: given a context-free grammar G, the proof produces 
(at least implicitly) an automaton or regular expression whose language has 
the same Parikh image as L(G). However, these constructions are relatively 
complicated, not given explicitly, or yield crude upper bounds: automata of size
$O(n^n)$ for grammars in Chomsky normal form with $n$ variables (see Section 4
for a detailed discussion). In this note we present an explicit and very simple
construction yielding an automaton with $O(4^n)$ states, for a lower bound of $2^n$.
An application of the automaton is briefly discussed in Section 3: the automaton
can be used to algorithmically derive the semilinear set, and, using recent results
on Parikh images of NFAs [16, 11], it leads to the best known upper bounds on
the size of the semilinear set for a given context-free grammar.

1. The Construction 

We follow the notation of [3, Chapter 5]. Let $G = (V, T, P, S)$ be a context-free
grammar with a set $V = \{A_1, \ldots, A_n\}$ of variables or nonterminals, a set $T$
of terminals, a set $P \subseteq V \times (V \cup T)^*$ of productions, and an axiom $S \in V$. We
construct a nondeterministic finite automaton (NFA) whose language has the
same Parikh image as $L(G)$. The transitions of this automaton will be labeled
with words of $T^*$, but note that by adding intermediate states (when the words
have length greater than one) and removing $\varepsilon$-transitions (i.e., when the words
have length zero), such an NFA can easily be brought into the more common form
where transition labels are elements of $T$.
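The first half of this normalization can be sketched as follows. This is a minimal illustration, not part of the construction itself: the function name `split_transitions` and the naming scheme for the intermediate states are assumptions, transitions with an empty label are simply kept as $\varepsilon$-moves (label `""`), and the removal of $\varepsilon$-transitions is not shown.

```python
def split_transitions(delta):
    """Replace each word-labelled transition by a chain of single-letter ones.

    `delta` is a set of (source, word, target) triples with `word` a string.
    Transitions with an empty word are kept as epsilon moves (label "").
    """
    result = set()
    for source, word, target in delta:
        if len(word) <= 1:
            result.add((source, word, target))
            continue
        previous = source
        for position, letter in enumerate(word[:-1], start=1):
            # Hypothetical naming scheme for the fresh intermediate states.
            middle = (source, word, target, position)
            result.add((previous, letter, middle))
            previous = middle
        result.add((previous, word[-1], target))
    return result

example = split_transitions({((0, 2), "ba", (0, 3)), ((1, 0), "a", (0, 0))})
```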

We need to introduce a few notions. For $\alpha \in (V \cup T)^*$ we denote by $\Pi_V(\alpha)$
(resp. $\Pi_T(\alpha)$) the Parikh image of $\alpha$ where the components not in $V$ (resp. $T$)
have been projected away. Moreover, let $\alpha_{/V}$ (resp. $\alpha_{/T}$) denote the projection
of $\alpha$ onto $V$ (resp. $T$). For instance, if $V = \{A_1, A_2\}$, $T = \{a, b, c\}$, and
$\alpha = a A_2 b A_1 A_1$, then $\Pi_V(\alpha) = (2, 1)$, $\Pi_T(\alpha) = (1, 1, 0)$ and $\alpha_{/T} = ab$. A
pair $(\alpha, \beta) \in (V \cup T)^* \times (V \cup T)^*$ is a step, denoted by $\alpha \Rightarrow \beta$, if there
exist $\alpha_1, \alpha_2 \in (V \cup T)^*$ and a production $A \to \gamma$ such that $\alpha = \alpha_1 A \alpha_2$ and
$\beta = \alpha_1 \gamma \alpha_2$. Notice that given a step $\alpha \Rightarrow \beta$, the strings $\alpha_1, \alpha_2$ and the
production $A \to \gamma$ are unique. The transition associated to a step $\alpha \Rightarrow \beta$ is the
triple $t(\alpha \Rightarrow \beta) = (\Pi_V(\alpha), \gamma_{/T}, \Pi_V(\beta))$. For example, if $V = \{A_1, A_2, A_3\}$ and
$T = \{a, b\}$, then $t(A_2 a A_1 \Rightarrow A_2 a A_2 b A_3) = ((1, 1, 0), b, (0, 2, 1))$.
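These notions translate directly into code. The sketch below is only an illustration under our own conventions: words are tuples of symbol names, the helper names `parikh`, `project` and `transition` are hypothetical, and a step is passed in already decomposed into its context and the applied production.

```python
def parikh(word, symbols):
    """Parikh image of `word` restricted to the ordered tuple `symbols`."""
    return tuple(sum(1 for x in word if x == s) for s in symbols)

def project(word, symbols):
    """Projection of `word` onto `symbols`, keeping the order of occurrences."""
    return tuple(x for x in word if x in symbols)

def transition(alpha1, lhs, rhs, alpha2, variables, terminals):
    """Transition of the step alpha1 lhs alpha2 => alpha1 rhs alpha2."""
    before = alpha1 + (lhs,) + alpha2
    after = alpha1 + rhs + alpha2
    return (parikh(before, variables), project(rhs, terminals), parikh(after, variables))

# First example of the text: alpha = a A2 b A1 A1.
alpha = ("a", "A2", "b", "A1", "A1")
pi_v = parikh(alpha, ("A1", "A2"))
pi_t = parikh(alpha, ("a", "b", "c"))
alpha_t = project(alpha, ("a", "b", "c"))

# Second example: the step A2 a A1 => A2 a A2 b A3 using A1 -> A2 b A3.
t = transition(("A2", "a"), "A1", ("A2", "b", "A3"), (),
               ("A1", "A2", "A3"), ("a", "b"))
```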

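Definition 1.1. Let $G = (V, T, P, S)$ be a context-free grammar and let $n = |V|$.
The $k$-Parikh automaton of $G$ is the NFA $M_G^k = (Q, T^*, \delta, q_0, \{q_f\})$ defined
as follows:

• $Q = \{(x_1, \ldots, x_n) \in \mathbb{N}^n \mid \sum_{i=1}^{n} x_i \leq k\}$;

• $\delta = \{t(\alpha \Rightarrow \beta) \mid \alpha \Rightarrow \beta \text{ is a step and } \Pi_V(\alpha), \Pi_V(\beta) \in Q\}$;

• $q_0 = \Pi_V(S)$;

• $q_f = \Pi_V(\varepsilon) = (0, \ldots, 0)$.

It is easily seen that $M_G^k$ has exactly $\binom{n+k}{n}$ states.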
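The definition can be turned into a short program. The sketch below relies on one observation that is ours rather than stated in the definition: a step that applies $A \to \gamma$ exists from some word with variable counts $q$ exactly when $q$ has at least one $A$, so it suffices to iterate over states and productions. The function name `parikh_automaton`, the encoding of productions as `(lhs, rhs)` pairs, and the use of single-character terminals (so that labels can be joined into strings) are assumptions for illustration.

```python
from itertools import product
from math import comb

def parikh_automaton(variables, productions, axiom, k):
    """Build the k-Parikh automaton; productions are (lhs, rhs-tuple) pairs."""
    n = len(variables)
    index = {v: i for i, v in enumerate(variables)}
    states = {q for q in product(range(k + 1), repeat=n) if sum(q) <= k}
    delta = set()
    for q in states:
        for lhs, rhs in productions:
            if q[index[lhs]] == 0:
                continue  # no word with variable counts q contains lhs
            target = list(q)
            target[index[lhs]] -= 1
            for symbol in rhs:
                if symbol in index:
                    target[index[symbol]] += 1
            if sum(target) <= k:  # the target must be a state as well
                label = "".join(s for s in rhs if s not in index)
                delta.add((q, label, tuple(target)))
    initial = tuple(1 if v == axiom else 0 for v in variables)
    final = (0,) * n
    return states, delta, initial, final

# Grammar A1 -> A1 A2 | a, A2 -> b A2 a A2 | c A1 with axiom A1 and k = 3.
GRAMMAR = [("A1", ("A1", "A2")), ("A1", ("a",)),
           ("A2", ("b", "A2", "a", "A2")), ("A2", ("c", "A1"))]
states, delta, q0, qf = parikh_automaton(("A1", "A2"), GRAMMAR, "A1", 3)
expected_states = comb(2 + 3, 2)
```

For this grammar the state set consists of all pairs with sum at most 3, and `delta` contains, for instance, the transition from $(0,2)$ to $(0,3)$ labelled `ba`.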

Figure 1 shows the 3-Parikh automaton of the context-free grammar with productions
$A_1 \to A_1 A_2 \mid a$, $A_2 \to b A_2 a A_2 \mid c A_1$ and axiom $A_1$. The states are
all pairs $(x_1, x_2)$ such that $x_1 + x_2 \leq 3$. Transition $(0, 2) \xrightarrow{ba} (0, 3)$ comes e.g.
from the step $A_2 A_2 \Rightarrow b A_2 a A_2 A_2$, and can be interpreted as follows: applying
the production $A_2 \to b A_2 a A_2$ to a word with zero occurrences of $A_1$ and two
occurrences of $A_2$ leads to a word with one new occurrence each of $a$ and $b$, zero
occurrences of $A_1$, and three occurrences of $A_2$.

Figure 1: The 3-Parikh automaton of $A_1 \to A_1 A_2 \mid a$, $A_2 \to b A_2 a A_2 \mid c A_1$ with $S = A_1$.

We define the degree of $G$ by $m := -1 + \max\{|\gamma_{/V}| : (A \to \gamma) \in P\}$; i.e.,
$m + 1$ is the maximal number of variables on the right hand sides. For instance,
the degree of the grammar in Fig. 1 is 1. Notice that if $G$ is in Chomsky normal
form then $m \leq 1$, and $m \leq 0$ iff $G$ is regular.
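A minimal sketch of this computation follows; the function name `degree` and the encoding of productions as `(lhs, rhs)` pairs are assumptions made for illustration.

```python
def degree(variables, productions):
    """Degree m = -1 + maximal number of variables on a right-hand side."""
    return -1 + max(sum(1 for s in rhs if s in variables) for _, rhs in productions)

# Grammar of Fig. 1.
GRAMMAR = [("A1", ("A1", "A2")), ("A1", ("a",)),
           ("A2", ("b", "A2", "a", "A2")), ("A2", ("c", "A1"))]
m = degree({"A1", "A2"}, GRAMMAR)

# A right-linear grammar S -> a S | b has at most one variable per right-hand side.
m_right_linear = degree({"S"}, [("S", ("a", "S")), ("S", ("b",))])
```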

In the rest of the note we prove: 

Theorem 1.1. If $G$ is a context-free grammar with $n$ variables and degree $m$,
then $L(G)$ and $L(M_G^{nm+1})$ have the same Parikh image.

For the grammar of Figure 1 we have $n = 2$ and $m = 1$, so $nm + 1 = 3$, and Theorem 1.1
yields that $L(G)$ and $L(M_G^3)$ have the same Parikh image. So the language of the
automaton of the figure has the same Parikh image as the language of the grammar.

Recall that $M_G^k$ has exactly $\binom{n+k}{n}$ states. Using standard properties
of binomial coefficients, for $M_G^{nm+1}$ and $m \geq 1$ we get an upper bound
of $2 \cdot (m+1)^n \cdot e^n$ states. For $m \leq 1$ (e.g. for grammars in Chomsky normal
form), the automaton $M_G^{n+1}$ has $\binom{2n+1}{n} \leq 2^{2n+1} \in O(4^n)$ states. On
the other hand, for every $n \geq 1$ the grammar $G_n$ in Chomsky normal form with
productions $\{A_k \to A_{k-1} A_{k-1} \mid 2 \leq k \leq n\} \cup \{A_1 \to a\}$ and axiom $S = A_n$
satisfies $L(G_n) = \{a^{2^{n-1}}\}$, and therefore the smallest Parikh-equivalent NFA
has $2^{n-1} + 1$ states. This shows that our construction is close to optimal.
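The bounds above can be sanity-checked numerically. The sketch below only tests small values of $n$ and $m$ and is of course no substitute for the argument via binomial coefficients; the helper names `num_states` and `gn_word_length` are hypothetical.

```python
from math import comb, e

def num_states(n, k):
    """Number of states of the k-Parikh automaton of a grammar with n variables."""
    return comb(n + k, n)

def gn_word_length(n):
    """Length of the unique word generated by G_n."""
    length = 1                  # A_1 -> a
    for _ in range(2, n + 1):   # A_k -> A_{k-1} A_{k-1} doubles the length
        length *= 2
    return length

checks = []
for n in range(1, 12):
    cnf = num_states(n, n + 1)  # degree m <= 1, so k = n + 1 suffices
    checks.append(cnf == comb(2 * n + 1, n) and cnf <= 2 ** (2 * n + 1))
    for m in range(1, 6):
        checks.append(num_states(n, n * m + 1) <= 2 * (m + 1) ** n * e ** n)
```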

2. The Proof 

Given $L_1, L_2 \subseteq T^*$, we write $L_1 =_\Pi L_2$ (resp. $L_1 \subseteq_\Pi L_2$) to denote that the
Parikh image of $L_1$ is equal to (resp. included in) the Parikh image of $L_2$. Also,
given $w, w' \in T^*$, we abbreviate $\{w\} =_\Pi \{w'\}$ to $w =_\Pi w'$.

We fix a context-free grammar $G = (V, T, P, S)$ with $n$ variables and degree $m$.
In terms of the notation we have just introduced, we have to prove
$L(G) =_\Pi L(M_G^{nm+1})$. One inclusion is easy:

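Proposition 2.1. For every $k \geq 1$ we have $L(M_G^k) \subseteq_\Pi L(G)$.

Proof. Let $k \geq 1$ be arbitrary, and let $q_0 \xrightarrow{\sigma} q$ be a run of $M_G^k$ on the word
$\sigma \in T^*$. We first claim that there exists a step sequence $S \Rightarrow^* \alpha$ satisfying $\Pi_V(\alpha) = q$
and $\Pi_T(\alpha) = \Pi_T(\sigma)$. The proof is by induction on the length $\ell$ of $q_0 \xrightarrow{\sigma} q$.
If $\ell = 0$, then $\sigma = \varepsilon$, and we choose $\alpha = S$, which satisfies $\Pi_V(S) = q_0$ and
$\Pi_T(S) = (0, \ldots, 0) = \Pi_T(\varepsilon)$. If $\ell > 0$, then let $\sigma = \sigma' \gamma$ and
$q_0 \xrightarrow{\sigma'} q' \xrightarrow{\gamma} q$.
By induction hypothesis there is a step sequence $S \Rightarrow^* \alpha'$ satisfying $\Pi_V(\alpha') = q'$
and $\Pi_T(\alpha') = \Pi_T(\sigma')$. Moreover, since $q' \xrightarrow{\gamma} q$ is a transition of $M_G^k$, there is
a production $A \to \gamma'$ and a step $\alpha_1 A \alpha_2 \Rightarrow \alpha_1 \gamma' \alpha_2$ such that $\Pi_V(\alpha_1 A \alpha_2) = q'$,
$\Pi_V(\alpha_1 \gamma' \alpha_2) = q$ and $\gamma'_{/T} = \gamma$. Since $\Pi_V(\alpha') = q' = \Pi_V(\alpha_1 A \alpha_2)$, $\alpha'$ contains
at least one occurrence of $A$, i.e., $\alpha' = \alpha'_1 A \alpha'_2$ for some $\alpha'_1, \alpha'_2$. We choose
$\alpha = \alpha'_1 \gamma' \alpha'_2$, and get

$\Pi_V(\alpha) = \Pi_V(\alpha'_1 \gamma' \alpha'_2) = \Pi_V(\alpha'_1 A \alpha'_2) - \Pi_V(A) + \Pi_V(\gamma')$
$= \Pi_V(\alpha') - \Pi_V(A) + \Pi_V(\gamma') = \Pi_V(\alpha_1 A \alpha_2) - \Pi_V(A) + \Pi_V(\gamma') = \Pi_V(\alpha_1 \gamma' \alpha_2) = q$.

Also

$\Pi_T(\alpha) = \Pi_T(\alpha'_1 \gamma' \alpha'_2) = \Pi_T(\alpha'_1 A \alpha'_2) + \Pi_T(\gamma') = \Pi_T(\alpha') + \Pi_T(\gamma')$
$= \Pi_T(\sigma') + \Pi_T(\gamma') = \Pi_T(\sigma') + \Pi_T(\gamma) = \Pi_T(\sigma)$.

This concludes the proof of the claim.

Now, let $\sigma$ be an arbitrary word with $\sigma \in L(M_G^k)$. Then there is a run
$q_0 \xrightarrow{\sigma} \Pi_V(\varepsilon)$. By the claim there exists a step sequence $S \Rightarrow^* \alpha$ satisfying
$\Pi_V(\alpha) = (0, \ldots, 0)$ and $\Pi_T(\alpha) = \Pi_T(\sigma)$. So $\alpha \in T^*$, and hence $\alpha \in L(G)$.
Since $\Pi_T(\alpha) = \Pi_T(\sigma)$ we have $\alpha =_\Pi \sigma$, and we are done. □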

The proof of the second inclusion $L(G) \subseteq_\Pi L(M_G^{nm+1})$ is more involved. To
explain its structure we need a definition.

Definition 2.1. A derivation $S = \alpha_0 \Rightarrow \cdots \Rightarrow \alpha_\ell$ of $G$ has index $k$ if for every
$i \in \{0, \ldots, \ell\}$, the word $(\alpha_i)_{/V}$ has length at most $k$. The set of words derivable
through derivations of index $k$ is denoted by $L_k(G)$.

For example, the derivation $A_1 \Rightarrow A_1 A_2 \Rightarrow A_1 c A_1 \Rightarrow A_1 c a \Rightarrow a c a$ has index
two. Clearly, we have $L_1(G) \subseteq L_2(G) \subseteq L_3(G) \subseteq \cdots$ and $L(G) = \bigcup_{k \geq 1} L_k(G)$.
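The index of a concrete derivation is easy to compute. In the sketch below the function `index_of` (a hypothetical name) returns the smallest $k$ for which the derivation has index $k$; by the definition, the derivation then also has every larger index.

```python
def index_of(derivation, variables):
    """Smallest k such that every sentential form has at most k variables."""
    return max(sum(1 for s in form if s in variables) for form in derivation)

# The derivation A1 => A1 A2 => A1 c A1 => A1 c a => a c a from the text.
derivation = [
    ("A1",),
    ("A1", "A2"),
    ("A1", "c", "A1"),
    ("A1", "c", "a"),
    ("a", "c", "a"),
]
k = index_of(derivation, {"A1", "A2"})
```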

The proof of $L(G) \subseteq_\Pi L(M_G^{nm+1})$ is divided into two parts. We first prove
the Collapse Lemma, Lemma 2.3, stating that $L(G) \subseteq_\Pi L_{nm+1}(G)$, and then
we prove, in Lemma 2.4, that $L_k(G) \subseteq_\Pi L(M_G^k)$ holds for every $k \geq 1$. A similar
result has been proved in [7] with different notation and in a different context.
We reformulate its proof here for the reader interested in a self-contained proof.

The Collapse Lemma. We need a few preliminaries. We assume the reader is
familiar with the fact that every derivation can be parsed into a parse tree [3,
Chapter 5], whose yield is the word produced by the derivation. We denote the
yield of a parse tree $t$ by $Y(t)$, and the set of yields of a set $\mathcal{T}$ of trees by $Y(\mathcal{T})$.
Figure 2 shows the parse tree of the derivation
$A_1 \Rightarrow A_1 A_2 \Rightarrow a A_2 \Rightarrow a c A_1 \Rightarrow a c a$.
We introduce the notion of dimension of a parse tree.

Figure 2: A parse tree of $A_1 \to A_1 A_2 \mid a$, $A_2 \to b A_2 a A_2 \mid c A_1$ with $S = A_1$

Definition 2.2. Let $t$ be a parse tree. A child of $t$ is a subtree of $t$ whose root is
a child of the root of $t$. A child of $t$ is called proper if its root is not a leaf, i.e., if
it is labeled with a variable. The dimension $d(t)$ of a parse tree $t$ is inductively
defined as follows. If $t$ has no proper children, then $d(t) = 0$. Otherwise, let
$t_1, t_2, \ldots, t_r$ be the proper children of $t$ sorted such that $d(t_1) \geq d(t_2) \geq \cdots \geq d(t_r)$.
Then

$$d(t) = \begin{cases} d(t_1) & \text{if } r = 1 \text{ or } d(t_1) > d(t_2), \\ d(t_1) + 1 & \text{if } d(t_1) = d(t_2). \end{cases}$$

The set of parse trees of $G$ of dimension $k$ is denoted by $\mathcal{T}^{(k)}$, and the set of all
parse trees of $G$ by $\mathcal{T}$.

The parse tree of Fig. 2 has two children, both of them proper. It has dimension
1 and height 3. Observe also the following fact, which can be easily proved by
induction.

Fact 2.1. Denote by $h(t)$ the height of a tree $t$. Then $h(t) > d(t)$.

For the proof of the collapse lemma, $L(G) \subseteq_\Pi L_{nm+1}(G)$, observe first that,
since every word in $L(G)$ is the yield of some parse tree, we have $L(G) = Y(\mathcal{T})$,
and so it suffices to show $Y(\mathcal{T}) \subseteq_\Pi L_{nm+1}(G)$. The proof is divided into
two parts. We first show $Y(\mathcal{T}) \subseteq_\Pi \bigcup_{k=0}^{n} Y(\mathcal{T}^{(k)})$ in Lemma 2.1, and then we
show $\bigcup_{k=0}^{n} Y(\mathcal{T}^{(k)}) \subseteq L_{nm+1}(G)$ in Lemma 2.2. Actually, the latter proves the
stronger result that parse trees of dimension $k \geq 0$ have derivations of index
$km + 1$, i.e., $Y(\mathcal{T}^{(k)}) \subseteq L_{km+1}(G)$ for all $k \geq 0$.

Lemma 2.1. $Y(\mathcal{T}) \subseteq_\Pi \bigcup_{k=0}^{n} Y(\mathcal{T}^{(k)})$.

Proof. In this proof we write $t = t_1 \cdot t_2$ to denote that $t_1$ is a parse tree except
that exactly one leaf $\ell$ is labelled by a variable, say $A$, instead of a terminal;
the tree $t_2$ is a parse tree with root $A$; and the tree $t$ is obtained from $t_1$ and $t_2$
by replacing the leaf $\ell$ of $t_1$ by the tree $t_2$. Figure 3 shows an example.

Figure 3: A decomposition $t_1, t_2$ such that $t = t_1 \cdot t_2$ is the parse tree of Fig. 2
In the rest of the proof we abbreviate parse tree to tree. We need to prove
that for every tree $t$ there exists a tree $t'$ such that $Y(t) =_\Pi Y(t')$ and $d(t') \leq n$.
We shall prove the stronger result that moreover $t$ and $t'$ have the same number
of nodes, and the sets of variables appearing in $t$ and $t'$ coincide.

Say that two trees $t, t'$ are $\Omega$-equivalent if they have the same number of
nodes, the sets of variables appearing in $t$ and $t'$ coincide, and $Y(t) =_\Pi Y(t')$
holds. Say further that a tree $t$ is compact if $d(t) \leq K(t)$, where $K(t)$ denotes
the number of variables that appear in $t$. Since $K(t) \leq n$ for every $t$, it suffices
to show that every tree is $\Omega$-equivalent to a compact tree. We describe a recursive
"compactification procedure" Compact($t$) that transforms a tree $t$ into
an $\Omega$-equivalent compact tree, and prove that it is well-defined and correct. By
well-defined we mean that some assumptions made by the procedure about the
existence of some objects indeed hold.

Compact($t$) consists of the following steps:

(1) If $t$ is compact then return $t$ and terminate.

(2) If $t$ is not compact then

(2.1) Let $t_1, \ldots, t_r$ be the proper children of $t$, $r \geq 1$.

(2.2) For every $1 \leq i \leq r$: $t_i$ := Compact($t_i$).

(I.e., replace in $t$ the subtree $t_i$ by the result of compactifying $t_i$.)
Let $x$ be the smallest index $1 \leq x \leq r$ such that $K(t_x) = \max_i K(t_i)$.

(2.3) Choose an index $y \neq x$ such that $d(t_y) = \max_i d(t_i)$.

(2.4) Choose subtrees $t_x^a, t_x^b$ of $t_x$ and subtrees $t_y^a, t_y^b, t_y^c$ of $t_y$ such that

(i) $t_x = t_x^a \cdot t_x^b$ and $t_y = t_y^a \cdot (t_y^b \cdot t_y^c)$; and

(ii) the roots of $t_x^b$, $t_y^b$ and $t_y^c$ are labelled by the same variable.

(2.5) $t_x := t_x^a \cdot (t_y^b \cdot t_x^b)$; $t_y := t_y^a \cdot t_y^c$.

(Loosely speaking, remove $t_y^b$ from $t_y$ and insert it into $t_x$.)

(2.6) Goto (1).

We first prove that the assumptions at lines (2.1), (2.3), and (2.4) about the
existence of certain subtrees hold.

(2.1) If $t$ is not compact, then $t$ has at least one proper child.
Assume that $t$ has no proper child. Then, by the definitions of dimension and
$K(t)$, we have $d(t) = 0 \leq K(t)$, and so $t$ is compact.

(2.3) Assume that $t$ is not compact, has at least one proper child, and all its
proper children are compact. Let $x$ be the smallest index $1 \leq x \leq r$ such that
$K(t_x) = \max_i K(t_i)$. Then there exists an index $y \neq x$ such that
$d(t_y) = \max_i d(t_i)$.

Let $1 \leq y \leq r$ (where for the moment possibly $x = y$) be an index such that
$d(t_y) = \max_i d(t_i)$. We have

$$\begin{aligned}
d(t) &\leq d(t_y) + 1 && \text{(by definition of dimension and of } y\text{)} \\
     &\leq K(t_y) + 1 && \text{(as } t_y \text{ is compact)} \\
     &\leq K(t_x) + 1 && \text{(by definition of } x\text{)} \\
     &\leq K(t) + 1   && \text{(as } t_x \text{ is a child of } t\text{)} \\
     &\leq d(t)       && \text{(as } t \text{ is not compact),}
\end{aligned} \qquad (1)$$

so all inequalities in (1) are in fact equalities. In particular, we have
$d(t) = d(t_y) + 1$ and so, by the definitions of dimension and of $y$, there exists $y' \neq y$
such that $d(t_{y'}) = d(t_y)$. Hence $x \neq y$ or $x \neq y'$, and w.l.o.g. we can choose $y$
such that $y \neq x$.

(2.4) Assume that $t$ is not compact, all its proper children are compact, and
it has two distinct proper children $t_x, t_y$ such that $K(t_x) = \max_i K(t_i)$ and
$d(t_y) = \max_i d(t_i)$. There exist subtrees $t_x^a, t_x^b$ of $t_x$ and subtrees $t_y^a, t_y^b, t_y^c$ of $t_y$
satisfying conditions (i) and (ii).

By the equalities in (1) we have $K(t_y) = d(t_y)$. By Fact 2.1 we have $d(t_y) < h(t_y)$.
So $K(t_y) < h(t_y)$, and therefore some path of $t_y$ from the root to a leaf
visits at least two nodes labelled with the same variable, say $A$. So $t_y$ can be
factored into $t_y^a \cdot (t_y^b \cdot t_y^c)$ such that the roots of $t_y^b$ and $t_y^c$ are labelled by $A$. Since
by the equalities in (1) we also have $K(t) = K(t_x)$, every variable that appears
in $t$ appears also in $t_x$, and so $t_x$ contains a node labelled by $A$. So $t_x$ can be
factored into $t_x = t_x^a \cdot t_x^b$ with the root of $t_x^b$ labelled by $A$.

This concludes the proof that the procedure is well-defined. It remains to
show that it terminates and returns an $\Omega$-equivalent compact tree. We start by
proving the following lemma:

If Compact($t$) terminates and returns a tree $t'$, then $t$ and $t'$ are $\Omega$-equivalent.

We proceed by induction on the number of calls to Compact during the execution
of Compact($t$). If Compact is called only once, then only line (1) is executed, $t$
is compact, no step modifies $t$, and we are done. Assume now that Compact is
called more than once. The only lines that modify $t$ are (2.2) and (2.5). Consider
first line (2.2). By induction hypothesis, each call to Compact($t_i$) during the
execution of Compact($t$) returns a compact tree $t'_i$ that is $\Omega$-equivalent to $t_i$. Let
$s_1$ and $s_2$ be the values of $t$ before and after the execution of $t_i$ := Compact($t_i$).
Then $s_2$ is the result of replacing $t_i$ by $t'_i$ in $s_1$. By the definition of $\Omega$-equivalence,
and since $t'_i$ is $\Omega$-equivalent to $t_i$, we get that $s_2$ is $\Omega$-equivalent to $s_1$. Consider
now line (2.5), and let $s_1$ and $s_2$ be the values of $t$ before and after the execution
of $t_x := t_x^a \cdot (t_y^b \cdot t_x^b)$ followed by the execution of $t_y := t_y^a \cdot t_y^c$. Since the subtree $t_y^b$
that is added to $t_x$ is subsequently removed from $t_y$, the Parikh image of $Y(t)$,
the number of nodes of $t$, and the set of variables appearing in $t$ do not change.
This completes the proof of the lemma.

The lemma shows in particular that if the procedure terminates, then it
returns an $\Omega$-equivalent tree. So it only remains to prove that the procedure
always terminates. Assume there is a tree $t$ such that Compact($t$) does not
terminate. W.l.o.g. we further assume that $t$ has a minimal number of nodes.
In this case all the calls to line (2.2) terminate, and so the execution contains
infinitely many steps that do not belong to any deeper call in the call tree,
and in particular infinitely many executions of the block (2.3)-(2.5). We claim
that in all executions of this block the index $x$ has the same value. For this,
observe first that, by the lemma, the execution of line (2.2) does not change
the number of nodes or the set of variables occurring in each of $t_1, \ldots, t_r$. In
particular, it preserves the value of $K(t_1), \ldots, K(t_r)$. Observe further that each
time line (2.5) is executed, the procedure adds nodes to $t_x$, and either does not
change or removes nodes from any other proper children of $t$. In particular, the
value of $K(t_x)$ does not decrease, and for every $i \neq x$ the value of $K(t_i)$ does
not increase. So at the next execution of the block the index $x$ of the former
execution is still the smallest index satisfying $K(t_x) = \max_i K(t_i)$. Now, since
$x$ has the same value at every execution of the block, each execution strictly
decreases the number of nodes of some proper child $t_y$ different from $t_x$, and
only increases the number of nodes of $t_x$. This contradicts the fact that all
proper children of $t$ have a finite number of nodes. □
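The quantities used in this proof (dimension, height, $K(t)$ and compactness) can be computed recursively. The following sketch is only an illustration and does not implement the procedure Compact itself: it assumes a tree encoding of our own choosing, namely a pair `(variable, children)` whose children are either terminal strings or trees, and all function names are hypothetical.

```python
def proper_children(tree):
    """Children whose root is a variable; a tree is a (variable, children) pair."""
    return [child for child in tree[1] if isinstance(child, tuple)]

def dimension(tree):
    dims = sorted((dimension(c) for c in proper_children(tree)), reverse=True)
    if not dims:
        return 0
    if len(dims) == 1 or dims[0] > dims[1]:
        return dims[0]
    return dims[0] + 1

def height(tree):
    # Terminal leaves have height 0, so a node with only leaves has height 1.
    return 1 + max((height(c) for c in proper_children(tree)), default=0)

def variables_of(tree):
    found = {tree[0]}
    for child in proper_children(tree):
        found |= variables_of(child)
    return found

def is_compact(tree):
    """t is compact if d(t) <= K(t), the number of variables appearing in t."""
    return dimension(tree) <= len(variables_of(tree))

# A parse tree of the grammar of Fig. 1 with two proper children and yield a c a.
example = ("A1", [("A1", ["a"]), ("A2", ["c", ("A1", ["a"])])])

# A full binary tree over a single variable A (productions A -> A A | a).
leaf = ("A", ["a"])
inner = ("A", [leaf, leaf])
full = ("A", [inner, inner])
```

The first tree has dimension 1 and height 3 and is compact; the second has dimension 2 but only one variable, so it is not compact and would be restructured by the procedure.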

Lemma 2.2. For every $k \geq 0$: $Y(\mathcal{T}^{(k)}) \subseteq L_{km+1}(G)$.

Proof. In this proof we will use the following notation. If $D$ is a derivation
$\alpha_0 \Rightarrow \cdots \Rightarrow \alpha_\ell$ and $w, w' \in (V \cup T)^*$, then we define $w D w'$ to be the step
sequence $w \alpha_0 w' \Rightarrow \cdots \Rightarrow w \alpha_\ell w'$.

Let $t$ be a parse tree such that $d(t) = k$. We show that there is a derivation
for $Y(t)$ of index $km + 1$. We proceed by induction on the number of non-leaf
nodes in $t$. In the base case, $t$ has no proper child. Then we have $k = 0$ and
$t$ represents a derivation $S \Rightarrow Y(t)$ of index 1. For the induction step, assume
that $t$ has $r \geq 1$ proper children $t_1, \ldots, t_r$ where the root of $t_i$ is assumed to be
labeled by $A^{(i)}$; i.e., we assume that the topmost level of $t$ is induced by a rule
$S \to \gamma_0 A^{(1)} \gamma_1 \cdots \gamma_{r-1} A^{(r)} \gamma_r$ for $\gamma_i \in T^*$. Note that $r - 1 \leq m$. By definition of
dimension, at most one child $t_i$ has dimension $k$, while the other children have
dimension at most $k - 1$. W.l.o.g. assume $d(t_1) \leq k$ and $d(t_2), \ldots, d(t_r) \leq k - 1$.
By induction hypothesis, for all $1 \leq i \leq r$ there is a derivation $D_i$ for $Y(t_i)$ such
that $D_1$ has index $km + 1$, and $D_2, \ldots, D_r$ have index $(k-1)m + 1$. Define, for
each $1 \leq i \leq r$, the step sequence

$$D'_i := \gamma_0 A^{(1)} \gamma_1 \cdots \gamma_{i-2} A^{(i-1)} \gamma_{i-1} \, D_i \, \gamma_i Y(t_{i+1}) \gamma_{i+1} \cdots \gamma_{r-1} Y(t_r) \gamma_r .$$

If the notion of index is extended to step sequences in the obvious way, then
$D'_1$ has index $km + 1$, and for $2 \leq i \leq r$, the step sequence $D'_i$ has
index $(i - 1) + (k - 1)m + 1 \leq km + 1$. By concatenating the step sequences
$S \Rightarrow \gamma_0 A^{(1)} \gamma_1 \cdots \gamma_{r-1} A^{(r)} \gamma_r$ and $D'_r, D'_{r-1}, \ldots, D'_1$ in that order, we obtain a
derivation for $Y(t)$ of index $km + 1$. □

Putting Lemma 2.2 and Lemma 2.1 together we obtain: 

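Lemma 2.3. [Collapse Lemma] $L(G) \subseteq_\Pi L_{nm+1}(G)$.

Proof.

$$\begin{aligned}
L(G) &= Y(\mathcal{T}) \\
     &\subseteq_\Pi \textstyle\bigcup_{k=0}^{n} Y(\mathcal{T}^{(k)}) && \text{(Lemma 2.1)} \\
     &\subseteq L_{nm+1}(G) && \text{(Lemma 2.2)}
\end{aligned}$$

□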

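Lemma 2.4. For every $k \geq 1$: $L_k(G) \subseteq_\Pi L(M_G^k)$.

Proof. We show that if $S \Rightarrow^* \alpha$ is a prefix of a derivation of index $k$ then
$M_G^k$ has a run $q_0 \xrightarrow{w} \Pi_V(\alpha)$ such that $w \in T^*$ and $\alpha_{/T} =_\Pi w$. The proof is by
induction on the length $i$ of the prefix.

$i = 0$. In this case $\alpha = S$, and since $q_0 = \Pi_V(S)$ and $S_{/T} = \varepsilon$ we are done.

$i > 0$. Since $S \Rightarrow^i \alpha$ there exist $\beta_1 A \beta_2 \in (V \cup T)^*$ and a production $A \to \gamma$
such that $S \Rightarrow^{i-1} \beta_1 A \beta_2 \Rightarrow \alpha$ and $\beta_1 \gamma \beta_2 = \alpha$. By induction hypothesis, there
exists a run of $M_G^k$ such that $q_0 \xrightarrow{w_1} \Pi_V(\beta_1 A \beta_2)$ and $(\beta_1 A \beta_2)_{/T} =_\Pi w_1$. Then
the definition of $\delta$ and the fact that $S \Rightarrow^i \alpha$ is of index $k$ show that there exists
a transition $(\Pi_V(\beta_1 A \beta_2), \gamma_{/T}, \Pi_V(\alpha))$, hence we find that
$q_0 \xrightarrow{w_1 \cdot \gamma_{/T}} \Pi_V(\alpha)$.
Next we conclude from $(\beta_1 A \beta_2)_{/T} =_\Pi w_1$ and $\alpha = \beta_1 \gamma \beta_2$ that
$\alpha_{/T} =_\Pi w_1 \cdot \gamma_{/T}$, and we are done.

Finally, if $\alpha \in T^*$ so that $S \Rightarrow^* \alpha$ is a derivation, then
$q_0 \xrightarrow{w} \Pi_V(\alpha) = (0, \ldots, 0)$, where $(0, \ldots, 0)$ is an accepting state and
$\alpha = \alpha_{/T} =_\Pi w$. □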

We now have all we need to prove the other inclusion. 

Proposition 2.2. $L(G) \subseteq_\Pi L(M_G^{nm+1})$.

Proof.

$$\begin{aligned}
L(G) &\subseteq_\Pi L_{nm+1}(G) && \text{(Collapse Lemma)} \\
     &\subseteq_\Pi L(M_G^{nm+1}) && \text{(Lemma 2.4)}
\end{aligned}$$

□
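The two inclusions can be cross-checked on small instances. The sketch below is a bounded experiment, not a proof: it explores pairs consisting of a vector of variable counts and a vector of terminal counts, which without a bound on the variables yields the Parikh images of $L(G)$ and with the bound $k$ yields those of $L(M_G^k)$. The function name `parikh_images` and the length bound `max_len` are hypothetical, and the pruning step assumes that every variable derives only nonempty words, which holds for the grammar of Fig. 1.

```python
def parikh_images(productions, variables, terminals, axiom, max_len, k=None):
    """Parikh images of the words of length <= max_len that can be derived
    while never holding more than k variables (k=None means no bound)."""
    vi = {v: i for i, v in enumerate(variables)}
    ti = {a: i for i, a in enumerate(terminals)}
    start = (tuple(int(v == axiom) for v in variables), (0,) * len(terminals))
    seen, stack, images = {start}, [start], set()
    while stack:
        q, counts = stack.pop()
        if sum(q) == 0:
            images.add(counts)  # no variables left: a terminal word was derived
            continue
        for lhs, rhs in productions:
            if q[vi[lhs]] == 0:
                continue
            nq, nc = list(q), list(counts)
            nq[vi[lhs]] -= 1
            for s in rhs:
                if s in vi:
                    nq[vi[s]] += 1
                else:
                    nc[ti[s]] += 1
            if k is not None and sum(nq) > k:
                continue
            # Assumption: every variable derives at least one terminal, so a
            # configuration with more than max_len symbols can be pruned.
            if sum(nq) + sum(nc) > max_len:
                continue
            nxt = (tuple(nq), tuple(nc))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return images

GRAMMAR = [("A1", ("A1", "A2")), ("A1", ("a",)),
           ("A2", ("b", "A2", "a", "A2")), ("A2", ("c", "A1"))]
ARGS = (GRAMMAR, ("A1", "A2"), ("a", "b", "c"), "A1", 8)
unbounded = parikh_images(*ARGS)        # Parikh images of L(G), length <= 8
bounded = parikh_images(*ARGS, k=3)     # n*m + 1 = 3 for this grammar
too_small = parikh_images(*ARGS, k=1)   # k = 1 only allows A1 -> a
```

For this grammar the bounded and unbounded explorations agree up to the chosen length, as Theorem 1.1 predicts, whereas $k = 1$ only yields the Parikh image of the word $a$.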
3. An Application: Bounding the Size of Semilinear Sets

Recall that a set $S \subseteq \mathbb{N}^k$, $k \geq 1$, is linear if there is an offset $b \in \mathbb{N}^k$ and
periods $p_1, \ldots, p_j \in \mathbb{N}^k$ such that $S = \{b + \sum_{i=1}^{j} \lambda_i p_i \mid \lambda_1, \ldots, \lambda_j \in \mathbb{N}\}$. A set is
semilinear if it is the union of a finite number of linear sets. It is easily seen that
the Parikh image of a regular language is semilinear. Procedures for computing
the semilinear representation of the language starting from a regular expression
or an automaton are well-known (see e.g. [14]). Combined with Theorem 1.1
they provide an algorithm for computing the Parikh image of a context-free
language.
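To make the definition concrete, the following sketch tests membership in a linear or semilinear set by brute force. It is meant for tiny examples only, since it enumerates all coefficient vectors up to a simple bound; the function names and the representation of a semilinear set as a list of `(offset, periods)` pairs are assumptions for illustration.

```python
from itertools import product

def in_linear_set(v, offset, periods):
    """Decide whether v = offset + sum_i lambda_i * periods[i] with lambda_i in N."""
    v, offset = tuple(v), tuple(offset)
    if any(x < o for x, o in zip(v, offset)):
        return False
    bound = sum(v)  # the coefficient of a nonzero period cannot exceed this
    periods = [p for p in periods if any(p)]  # zero periods contribute nothing
    for lambdas in product(range(bound + 1), repeat=len(periods)):
        candidate = tuple(
            o + sum(l * p[i] for l, p in zip(lambdas, periods))
            for i, o in enumerate(offset)
        )
        if candidate == v:
            return True
    return False

def in_semilinear_set(v, linear_sets):
    """A semilinear set is given as a list of (offset, periods) pairs."""
    return any(in_linear_set(v, b, ps) for b, ps in linear_sets)

# The Parikh image of (ab)^* is the linear set with offset (0, 0) and period (1, 1).
AB_STAR = [((0, 0), [(1, 1)])]
```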

Recently, To has obtained an upper bound on the size of the semilinear 
representation of the Parikh image of a regular language (see Theorem 7.3.1 of 
[16]): 

Theorem 3.1. Let $A$ be an NFA with $s$ states over an alphabet of $\ell$ letters.
Then $\Pi(L(A))$ is a union of $O(s^{\ell^2 + 3\ell + 3} \ell^{4\ell + 6})$ linear sets with at most $\ell$ periods;
the maximum entry of any offset is $O(s^{3\ell + 3} \ell^{4\ell + 6})$, and the maximum entry of
any period is at most $s$.

Plugging Theorem 1.1 into Theorem 3.1, we get the (to our knowledge) best 
existing upper bound on the size of the semilinear set representation of the 
Parikh image of a context-free language. Let $G = (V, T, P, S)$ be a context-free
grammar of degree $m$ with $n = |V|$ and $t = |T|$. Let $p$ be the total number
of occurrences of terminals in the productions of $G$, i.e., $p = \sum_{(X \to \alpha) \in P} |\alpha_{/T}|$.
The number of states of $M_G^{nm+1}$ is $\binom{n+nm+1}{n}$. Recall that the transitions of
$M_G^{nm+1}$ are labelled with words of the form $\gamma_{/T}$, where $\gamma$ is the right-hand side
of some production. Splitting transitions, adding intermediate states, and then
removing $\varepsilon$-transitions yields an NFA with $\binom{n+nm+1}{n} \cdot p$ states. So we finally
obtain for the parameters $s$ and $\ell$ in Theorem 3.1 the values $s := \binom{n+nm+1}{n} \cdot p$
and $\ell := t$. This result (in fact a slightly stronger one) has been used in [6] to
provide a polynomial algorithm for a language-theoretic problem relevant for
the automatic verification of concurrent programs.
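As a worked instance of these parameters, the sketch below evaluates $s$ and $\ell$ for the grammar of Fig. 1. The function name `to_parameters` and the encoding of productions are hypothetical; the code merely evaluates the formulas above and does not build the split automaton.

```python
from math import comb

def to_parameters(variables, terminals, productions):
    """Values (s, l) to be plugged into Theorem 3.1 for a given grammar."""
    n = len(variables)
    m = -1 + max(sum(1 for x in rhs if x in variables) for _, rhs in productions)
    # p: total number of terminal occurrences in the productions
    p = sum(1 for _, rhs in productions for x in rhs if x not in variables)
    return comb(n + n * m + 1, n) * p, len(terminals)

GRAMMAR = [("A1", ("A1", "A2")), ("A1", ("a",)),
           ("A2", ("b", "A2", "a", "A2")), ("A2", ("c", "A1"))]
s, l = to_parameters({"A1", "A2"}, {"a", "b", "c"}, GRAMMAR)
```

Here $n = 2$, $m = 1$ and $p = 4$, so $s = \binom{5}{2} \cdot 4 = 40$ and $\ell = 3$.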

4. Conclusions and Related Work 

For the sake of comparison we will assume throughout this section that all
grammars have degree $m \leq 1$. Given a context-free grammar $G$ with $n$ variables,
we have shown how to construct an NFA $M$ with $O(4^n)$ states such that $L(G)$
and $L(M)$ have the same Parikh image. We compare this result with previous
proofs of Parikh's theorem.

Parikh's proof [1] (essentially the same proof is given in [15]) shows how to
obtain a Parikh-equivalent regular expression from a finite set of parse trees of
$G$. The complexity of the resulting construction is not studied. By its definition,
the regular expression basically consists of the sum of words obtained from the
parse trees of height at most $n^2$. This leads to the admittedly rough bound
that the regular expression consists of at most $O(2^{2^{n^2 - 1}})$ words each of length
at most $O(2^{n^2})$.

Greibach [8] shows that a particular substitution operator on language classes 
preserves semilinearity of the languages. This result implies Parikh's theorem, 
if the substitution operator is applied to the class of regular languages. It is 
hard to extract a construction from this proof, as it relies on previously proved 
closure properties of language classes. 

Pilling's proof [14] (also given in [4]) of Parikh's theorem uses algebraic prop- 
erties of commutative regular languages. From a constructive point of view, his 
proof leads to a procedure that iteratively replaces a variable of the grammar G 
by a regular expression over the terminals and the other variables. This proce- 
dure finally generates a regular expression which is Parikh-equivalent to L(G). 
Van Leeuwen [13] extends Parikh's theorem to other language classes, but, while
using very different concepts and terminology, his proof leads to the same
construction as Pilling's. Neither [14] nor [13] studies the size of the resulting regular
expression.

Goldstine [9] simplifies Parikh's original proof. An explicit construction
can be derived from the proof, but it is involved: for instance, it requires
computing, for each subset of variables, all derivations with
these variables up to a certain size depending on a pumping constant.

Hopkins and Kozen [10] generalize Parikh's theorem to commutative Kleene
algebra. As in Pilling [14], their procedure to compute a Parikh-equivalent
regular expression is iterative; but rather than eliminating one variable in each
step, they treat all variables in a symmetric way. Their construction can be
adapted to compute a Parikh-equivalent finite automaton. Hopkins and Kozen
show (by algebraic means) that their iterative procedure terminates after $O(3^n)$
iterations for a grammar with $n$ variables. In [7] we reduce this bound (by
combinatorial means) to $n$ iterations. The construction yields an automaton,
but it is much harder to explain than ours. The automaton has size $O(n^n)$.

In [2] Parikh's theorem is derived from a small set of purely equational 
axioms involving fixed points. It is hard to derive a construction from this 
proof. 

In [5] Parikh's theorem is proved by showing that the Parikh image of a 
context-free language is the union of the sets of solutions of a finite number of 
systems of linear equations. In [17] the theorem is also implicitly proved, this 
time by showing that the Parikh image is the set of models of an existential 
formula of Presburger arithmetic. While the constructions yielding the systems 
of equations and the Presburger formulas are very useful, they are also more 
complicated than our construction of the Parikh automaton. Also, neither [5]
nor [17] gives bounds on the size of the semilinear set.